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This paper derives fixed-order recursive Least-Squares (LS) algorithms that 
can be used in system identification and adaptive filtering applications such 
as spectral estimation, and speech analysis and synthesis. These algorithms 
solve the sliding-window and growing-memory covariance LS estimation prob- 
lems, and require less computation than both unnormalized and normalized 
versions of the computationally efficient order-recursive (lattice) covariance 
algorithms previously presented. The geometric or Hilbert space approach, 
originally introduced by Lee and Morf to solve the prewindowed LS problem, 
is used to systematically generate least-squares recursions. We show that 
combining subsets of these recursions results in prewindowed LS lattice and 
fixed-order (transversal) algorithms, and in sliding-window and growing- 
memory covariance lattice and transversal algorithms. The paper discusses 
both least-squares prediction and joint-process estimation. 

I. INTRODUCTION 

Computationally efficient recursive Least-Squares (LS) algorithms 
have recently attracted attention in applications such as adaptive 
equalization, 1-4 echo cancellation, 5 and speech analysis and synthesis 6,7 
because of their fast convergence properties when compared to older 
least-mean-square or gradient adaptation techniques. 8-10 Since the 
work on computationally efficient LS algorithms by Morf and others 
first appeared in Refs. 11 and 12, numerous papers have followed that 
produce computationally efficient algoritttms that solve different types 
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of autoregressive LS estimation problems. 13 " 17 In general, these algo- 
rithms fall into four categories: (1) prewindowed recursive LS, (2) 
sliding-window recursive LS, (3) growing-memory covariance recursive 
LS, and (4) nonrecursive LS algorithms. Each of the first three 
categories has two subcategories: fixed-order, or transversal, algo- 
rithms; and order-recursive, or lattice, algorithms. 

References 1 and 12 present a prewindowed LS algorithm that 
satisfies a transversal filter structure (fast Kalman algorithm). Sub- 
sequent Refs. 7, 18, and 19 describe prewindowed and growing-memory 
covariance LS algorithms that satisfy a lattice structure. Normalized 
prewindowed LS lattice algorithms that involve fewer recursions than 
the original unnormalized versions, and which have the important 
advantage that all internal variables are less than or equal to unity in 
magnitude are presented in the more recent Ref. 13. Reference 14 
extends the normalized lattice algorithms to solve the sliding-window 
and growing-memory covariance LS problems. The recursive algo- 
rithms mentioned so far require order N arithmetic operations per 
iteration to update the filter parameters, where N is the order of the 
filter. A computationally efficient order-recursive algorithm that 
solves the set of linear equations for the covariance LS prediction 
problem has been presented in Ref. 11, and extended to the joint- 
process-estimation case in Ref. 17. These algorithms require order iV 2 
operations to compute the LS prediction coefficients and are nonre- 
cursive in the sense that the solution generated at time interval i is 
not used to generate the solution at time interval i + 1. 

This paper attempts to unify and extend the previous work by (1) 
systematically generating all of the recursions needed to derive all of 
the previously mentioned algorithms, and (2) using these recursions 
to derive new recursive fixed-order sliding-window and growing-mem- 
ory covariance LS algorithms. These new algorithms solve directly for 
the prediction- or autogressive-model coefficients, and involve signif- 
icantly less computation than both the unnormalized and normalized 
versions of the order-recursive or covariance lattice algorithms pre- 
sented in Ref. 14. In addition, in some applications it may be advan- 
tageous to work directly with the prediction- or autogressive-model 
coefficients, rather than the set of reflection coefficients produced by 
lattice algorithms. The algorithms mentioned in the previous para- 
graph, along with the new ones derived here, are obtained by appro- 
priately arranging subsets of least-squares recursions. The geometric 
or Hilbert space approach originally used by Lee and Morf 20 to derive 
the prewindowed LS lattice algorithm is used to derive all of the basic 
least-squares recursions in a cohesive manner. In this paper, however, 
only scalar-valued data are considered. 

The next section defines the sliding-window and growing-memory 
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covariance LS problems to be solved. Then Section III reviews the 
geometric approach to LS estimation. Fundamental order and time 
updates for the least-squares projection operator are given in Section 
IV, with derivations in Appendix A. In Section V these projection 
updates systematically derive least-squares recursions. Section VI 
gives fixed-order covariance algorithms and Section VII extends the 
preceding discussion to the joint-process-estimation case. Appendix B 
lists subsets of recursions in Sections V and VII that constitute other 
LS algorithms. 

II. PROBLEM STATEMENT 

We start by defining a sequence of data values y , y>i, • • • , y„ where 
i is the current time index. A linear least-squares forward predictor of 
order n chooses the coefficients fj\„ to minimize 

(f(i\n) = £ [y m - £ fj\ny m -j) , (1) 

m-i' \ 7=1 / 

where i' to i is the time window of interest. The coefficients fj\ nt 1 < 
j < n, are called the forward-prediction coefficients. A linear least- 
squares backward predictor of order n chooses the backward-prediction 
coefficients bj\ nt 1 <j < n, to minimize 



tb(i\n) = 2 \y m -n - X 6/|nym-/+l) • (2) 

m-i 1 \ 7=1 / 

Minimization of (1) rather than (2) is generally desired for a given 
application. The forward and backward prediction problems stated 
above are closely related, however, and the LS algorithms to be 
presented use the backward prediction coefficients to solve for the 
forward prediction coefficients in a computationally efficient manner. 
• If, instead of estimating future values of the same process, we wish 
to estimate another related process {*,■}, the least-squares cost function 
becomes 

€ x (i\n) = t [x m - J Cj\ n y m _ j+l ) , (3) 

m=i' \ 7=1 / 

where tap coefficients Cj\ n replace the prediction coefficients fy„ and 
bj\ n . The cost function (3) is relevant to joint-process-estimation 
problems such as channel equalization and echo cancellation. In the 
case of channel equalization, y, is the ith sample of the channel output, 
and %i is the ith channel symbol. 

Setting the derivatives of the cost functions (1), (2), and (3) with 
respect to the prediction (tap) coefficients equal to zero results in the 
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following linear equations: 

i 

<*V_ u _i | „f(i | n) = £ yjYj-i\n, (4a) 



and 



where 



#IM|»b(i|») - S yj-nYi\ny ( 4b ) 



i 

*i',,l»c(i|n) = X Wim (4c) 



f T U» s [/l|n/2|» •'• fn\n], (5a) 

b r (i|n) s [6 1|n fe 2 ,„ ••• b n \ n ], (5b) 

c T (i|n) = [ci|„c 2 |„ ••• c„|„], (5c) 

yjf» - [yjyj-i • • • yj-n+i], (6) 

and the covariance matrix 

$i\i\n - X yj\nyf\n- (7) 



Suppose now that i' = 0, and that y is the first available data 
sample. The least-squares solutions for f , b, and c, obtained by solving 
(4), are undefined since they depend on the unspecified data values 
v — 1, y — 2, ■ • • , y-n- The simplest, and perhaps most popular, 
technique for circumventing this problem is to assume all data values 
y jt j < 0, are zero. The least-squares solutions resulting from this so- 
called prewindowed estimation are then well defined as long as the 
covariance matrix is nonsingular. In applications such as speech 
modelling, however, where estimates of the prediction coefficient 
vector f (i \ n) are desired given relatively few data samples, prewin- 
dowed estimation may result in undesirable edge effects from assuming 
data is zero outside a given window. For these types of applications, it 
is desirable to estimate the prediction coefficients without any as- 
sumptions concerning the data outside the time window of interest. 

Covariance least-squares estimation replaces the lower time limit i' 
in (1), (2), and (3) by n, so that only known data values are used to 
compute the LS prediction (tap) coefficients. The improved estimates 
so obtained are not without cost, however. The resulting covariance 
LS algorithms derived in this paper and elsewhere 7 involve more 
computation than prewindowed LS algorithms. Notice that at each 
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iteration i, the prediction coefficients are computed from i + 1 data 
values. Because the number of data samples entering the least-squares 
computation grows with time, this type of estimation has been called 
growing-memory covariance estimation. 16 

Finally, another windowing technique that has attracted attention 
recently is the sliding-window technique, in which the lower time limit 
i' in (1) and (2) is replaced by i - M + n + 1, and in (3) by i - N + 
n, where M is a predetermined constant. At each iteration the least- 
squares prediction coefficients are therefore computed from a fixed 
number (M) of data samples. Notice that data samples outside the 
time window i — M + 1 to i have no effect on the least-squares solution 
for f, b, and c at time i, i.e., they are totally forgotten. This is in 
contrast to more conventional exponential forgetting techniques that 
reduce the effects of past data samples in a more continuous fashion. 16 
The sliding window is therefore useful in applications where the 
autoregressive model changes abruptly with time, or where undesirable 
transients periodically affect the data samples. In the former case, 
when the model parameters change, the sliding window eventually 
discards data values corresponding to previous model parameters. In 
the latter case, the sliding window eventually discards corrupted data 
values. 

Computationally efficient recursive algorithms that solve the grow- 
ing-memory covariance and sliding-window LS estimation problems 
will be derived in Sections V through VII. The next section develops 
the necessary mathematical background by reviewing the geometric 
interpretation of linear least-squares estimation. 

III. MATHEMATICAL BACKGROUND 

Given two vectors X and Y having the same dimension i, the inner 
product of X and Y is defined to be 

(X, Y) = X r WY, (8) 

where W is some prespecified i X i weighting matrix. As an example, 
a typical weighting matrix is the exponential weighting matrix 

W, : = [1 w w 2 • • • 10*% (9) 

where I is the i x i identity matrix. For convenience, we will assume 
that W is the identity matrix. Modification of the results in this paper 
to the case where W is arbitrary is straightforward. The distance 
between two vectors X and Y with the same dimension is therefore 
the regular Euclidean distance, 

d(X, Y) - || Y - X I! - <Y - X, Y - X) 1/2 . (10) 

The (nth order) projection of a vector Y onto a subspace (or 
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manifold) M, which is spanned by the n vectors (Xi, X 2 , • • • , X„j, is 
denoted as P W Y. The orthogonal projection of Y onto M is defined as 

P M Y - Y - P M Y, (11) 

and is orthogonal to the subspace M. This implies that 

<X„ Y - P M Y) =0, for j = 1, ■ • • , n. (12) 

Since P M Y lies in M, there exist constants, or regression coefficients 
fit A, • • • , fn such that 

PmY = £ / ; X, = Sf, (13) 

where S = [X x . • ■ X„] and f r - [A • • • AL Using (12) and (13), it is 
easy to show that 

f =(S r S)- x S T X (14) 

and 

PmY = S(S T S)- 1 S r Y, (15) 

assuming S r S is nonsingular. 

The linear least-squares estimate of Y, based upon the vectors 
Xi, • • • , X„, is formed by choosing A, • • • , /„ such that 

uir-iY-s/Ai" (16) 

is minimized. Differentiating this quantity with respect to/; and setting 
the result equal to zero gives 

Y - l fjXj = P M Y, (17) 

and the vector of estimation errors, 

£ = Y - £ ffKj = PmY. (18) 

7=1 

We have identified the operator P as a least-squares projection. 

IV. PROJECTION-OPERATOR UPDATE FORMULAS 

In this section some fundamental relationships satisfied by the least- 
squares projection operator are presented. These projection updates 
fall into two main categories: order updates and time updates. Under 
time updates are two subcategories, forward and backward time up- 
dates. We point out in advance that a total of three projection-operator 
updates will be used throughout this paper: one order update, one 
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forward time update, and one backward time update. In addition, one 
forward and one backward time update for inner products will be 
needed. 

4. 1 Order updates 

Given two vectors, Y and X, and a linear space M spanned by the 
vectors Xi, X2, • ■ • , X„, all in R', suppose we wish to calculate the 
least-squares estimate of Y based upon the vectors Xi, • • • , X n and 
X. In particular, we wish to find coefficients a,, / — 1, • • • , n, and b 
such that || Y - (£"=i oj&j + °X) || 2 is minimized. From the discussion 
in the last section, we know that the least-squares estimate of Y is 



I ajXj + bX = P,m+x|Y, 



(19) 



where \M + X| denotes the space spanned by M and X. We can write 
the following orthogonal decomposition of the space {M + X}, 21 



\M + X) = M \Pi,Xl 



(20) 



By the Hilbert space projection theorem, 21 we have that for any vector 
Y 6 R', 

JW,Y = P M Y + P {P &Y. (21) 

Figure 1 illustrates this equation for the special case n = 1. The 
projection of Y onto the space spanned by two vectors Xi and X 2 is 
shown as the sum of the two projections P^Y and P\p% x 2 |Y. 




Fig. 1 — Decomposition of P| Xl +x,|Y. 
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Equation (21) constitutes a fundamental order update for the least- 
squares projection operator. The (n + l)st-order projection P|m+xi is 
expressed as the sum of the nth order projection P M and the first order 
projection P\p&l\. By subtracting both sides of (21) from Y, we obtain 
the following order update for the orthogonal projection operator P ± , 

PiiwiY = PhY - P, Pj4 x,Y. (22) 

4.2 Forward time updates 

The forward time updates derived in this section compute a least- 
squares projection at time i given the same least-squares projection at 
time i — l. These recursions, when combined with the order recursions 
in the last subsection, can be used to derive prewindowed LS algo- 
rithms. We first consider the following vectors X^,, and Y^,,, which are 
composed of data samples from time to to i, i.e., 

X J,, = [xi Xi-i • • • *J, (23a) 

and 

Yj,, = b,y.-i ■•■yj. (23b) 

For notational convenience, in this section only we will omit the lower 
time subscript on the data vectors and assume it to be io. Our objective 
is to compute the linear least-squares estimate of Y„ given X, in terms 
of a least-squares estimate that does not use the most recent value y,. 
With this in mind we define the unit vector 

uf=[10-0 0], (24) 

which has the same dimension as Y„ i.e., u, £ R H * W . Associated with 
u, is the space spanned by u„ or the space of most recent data values, 
denoted as £/,. Note that P Ut Yi = y,u,-. For notational convenience we 
define a tilde operator as follows, 

Y, - PbYi = [0 y,-! y,_ 2 ■ • . y^, yj, (25) 

i.e., Y, is the projection of Y z onto the subspace of past data values. 

The basic prediction problem is illustrated in Fig. 2, where Y, is a 
vector having its endpoint in back of the plane of the paper and X, 
has its endpoint in front of the plane of the paper. We are given the 
vector X„ from which the least-squares estimate of Y„ Px.Y,, is to be 
recursively obtained. At time i we therefore assume a regression 
coefficient a computed at time i — 1 (i.e., Px w Y»-i = aX,_i, or 
equivalents, P Xl Y, = aX,), which we wish to modify using the most 
recent data values y, and x,. Figure 2 therefore shows PxY, decomposed 
into the two vectors aX, and P X ,(Y, - aX,). Figure 3 illustrates the 
plane spanned by X„ X„ and [/,. Since ABC and ADE are similar 
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Fig. 2— Decomposition of P X ,Y<. 



triangles, 



AB AC 

AD ~ AE 



= a, 



(26) 



so that AC = aXi. Figure 4 attempts to include vectors not shown in 
Fig. 2 and again illustrates the decomposition of P X Y,. (Only the 
endpoint of Y, is in Fig. 4.) 

Assume now that the vector X, is replaced by the subspace M, 
spanned by the vectors Xi,„ X 2 ,„ • • • , X„,,. Let 



Si — [Xi,, X2,i • • • X n ,i] 



and 



(27) 
(28) 



Si — [Xi,, X 2 ,, • • • X n ,,]. 
We define the projection 

p Miu _ 1 Y i = s.isrs.rsrt, = s,f, (29) 

i.e., Pj4, M Yj lies in M„ but uses regression coefficients computed at 
time i - 1. Referring to Fig. 3, Px.^Y, = aX,. Appendix A shows that 



P M Y t = Pna^Yt + PmM*» PitYi)sec%, 



(30) 



where 



sin 2 ^= (UbPaPi) = WPmM 2 
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B D 

Fig. 3— Plane spanned by X, and X,. 



and 



<Y,- a X,) 




Fig. 4— Rotated view of Fig. 2. 

= (urs.ocsrs.o-^sru,), 



sec 2 0, = 



2fl • 



1 - sm'0, 



(31) 



(32) 



The variable 0, can be interpreted as the angle between the spaces 
spanned by the matrices of basis vectors S, and S,. Referring to Fig. 
3, the angle is given by 



in 2 = 1 - 



XII 2 z 2 



sin 



IIX.-H 2 ||X,-|| 2 ' 
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(33) 



and measures the unexpectedness of the data received at time i. Notice 
that (33) can be rewritten as (31), where M, and S, are replaced by X,-. 
We obtain the following time update for the orthogonal projection 
operator by subtracting both sides of (30) from Y„ 

PhYi = Pfc, H Y, - PmM** PkJt)i*Hi. (34) 

One more relation that will be useful in the following section is a 
recursive equation for the inner product (v„ Pa/Y,), where v, is an 
arbitrary vector in R'~' 0+1 . This recursion, which is derived in Appendix 
A, is 

<v,-, PhYi) = <v„ Pjtftfi + <u„ P^v.Xu,, PhJ i )eec t e it (35) 

where M, is the space spanned by S,. 

4.3 Backward time updates 

Consider again the data vectors X, and Y, defined by (23). Suppose 
we wish to compute the linear least-squares estimate of Y, given X, in 
terms of a least-squares estimate that does not use the most distant 
or past values y^ and x, . Clearly, this problem can be solved in exactly 
the same fashion as the time-update problem stated at the beginning 
of the last section. By turning the vectors Y, and X, upside down, and 
assuming that y^ and x^ are the most recent samples, one can solve 
this problem by using time updates already derived. The same argu- 
ment holds when X, is replaced by the subspace M, spanned by vectors 
Xi,„ X 2 ,„ • • • , X„,,. In this case we wish to calculate the projection 
PmYi in terms of a projection onto the space spanned by the matrix 
of basis vectors S, in which the bottom row has been replaced by zeros. 
This is in contrast to the previous time updates, which expressed 
P M Y, in terms of a projection onto the space spanned by S, in which 
the top row has been replaced by zeroes (i.e., M,). 

In analogy with the notation defined in the last section, we define 
the unit vector 



< = [0 • ■ • 1] e R«-*> + \ (36) 

panned by u, as U^. We also define the following 
r in analogy with the previous tilde operator, 

Y?=PiKY I = [y I .y 1 _ 1 . . . ^ +1 o) T . (37) 



Similarly, 

Sf= [XfcXfc ••• X*,]. (38) 

The projection of Y, onto M, using regression coefficients computed 
from S*is defined as 

p M^Ji = S 1 [SrSf]- 1 [Sf r Y,]. (39) 
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The regression coefficients that multiply the basis vectors of M, are 
in this case elements of the vector [Sf r Sf ]" l [Sf T Y,]. 

The derivation of (30) can be repeated with u, replaced by u^, tildes 
replaced by asterisks, and P Mj|iH replaced by Pm^ +1 to give the follow- 
ing projection decomposition, 

P M Yi = P MioUo+l Yi + PitPkiVLk, PfeY,>sec 2 ff, (40) 

where 

sin 2 0f = IIPm.-uJ 2 

= (ulSiHSTsd-HsluO 

= <U, , Pjypfc), (41) 

and 

sec 2 0f = - ^. (42) 

1 — surfl? 

Subtracting both sides of (40) from Y, gives 

PhYi = W^JT, - Pm.u^u^, PfcY,->sec»ff. (43) 

Finally, the following update for inner products is analogous to (35), 

<v„ PifYi) = <vf, PfoYf) + (u^, PbvdiUi,, Pj*?i)8ec 2 df. (44) 

This completes the presentation of projection-operator recursions 
needed to derive the least-squares recursions in Sections V and VII. 
All order updates for variables entering the least-squares algorithms 
to be presented can be derived from (22). Similarly, all forward and 
backward time updates for vectors entering these algorithms can be 
derived from (34) and (43), respectively, and all forward and backward 
time updates for inner products can be derived from (35) and (44), 
respectively. 

V. LEAST-SQUARES RECURSIONS 
5.1 Notation 

Referring to the definition (23), a shift operator z~ j is defined by 

z- i Yl i =[y H y H - x •••3V,]. (45) 

Equations (1) and (2) can now be rewritten as 

e/(l | n) = || Y w - £ h\nif*I**d II 2 (46a) 

;=i 

and 
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e b (i\n) = ||2-"Y, 0+n ,, - I bjmiz-^Yit+n.d II 2 , (46b) 

i=\ 

where i ' has been replaced by io + n. A matrix of shifted data vectors 
is denoted as 

EWl ») = [*"%>+».< *" # " 1 T%hm • • • •""IW, (47) 

where / < n. The space spanned by the columns of S^+ nt i(l, n), which 
is a subspace generated by past data values, is denoted as Af, 0+niJ (/, n). 
For notational convenience we will omit the lower time index of S and 
M and assume that it is always io + n. Notice that we can write the 
covariance matrix defined by (7) as 

«Vn,m = SftO, n - 1)S,(0, n - 1). (48) 

Two types of updates exist for least-squares parameters: order 
updates and time updates. The time updates in this section generally 
fall into two categories. Given some LS parameter £ (i.e., the forward 
prediction vector f or the prediction residual), we wish to find (1) a 
recursion for £ computed from the data samples {y, , y, +i, • • • , y,} in 
terms of £ computed from the data samples \y^, y^+i, • • • , y,-i} (for- 
ward time update), and (2) a recursion for £ computed from the data 
samples [y^, y^+i, ■■• ,y t ) in terms of £ computed from the data 
samples fy^+i, y^+2, • • • , y,} (backward time update). Associated with 
the variable £ is therefore an order index n and the time indices of the 
data used in the least-squares computation. If the data values 
{ y^ y^+u • • ■ » yi\ ar e used to compute £, then the indices io and i must 
be specified. This is in contrast to the prewindowed case where only i 
need be specified since io is always zero. 

Throughout the rest of this paper, the starting-time index of the 
generic parameter £ will appear as a subscript, and the current-time 
index will appear as a function argument. As an example ^(i | n) 
implies that the data values \y^, y^+i, ■ • • , y,j are used to compute 
the nth order variable £. The following variables are needed to derive 
the LS algorithms in the next section: 

1. Forward and backward prediction vectors [from (4)], 

f, (i | n) = <^l„-,,,-n„[Sm, n)Y^nJ\ (49a) 

and 

b, (i| n) m *-i„, |n [Sr(0, n - l)(2-"Y^ +n ,)]. (49b) 

2. Forward and backward prediction residual vectors, 

E ftio (i | n) = Y l0+ „, ( - S,(l, n)f, (i | n) (50a) 

and 
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Ew.tf | n) m 2 -"Y^,, - S,(0, n - l)b k (i \ n). (50b) 

3. Forward and backward prediction residuals (scalars), 

e fAt (i\n) m (Ui,E f Ji\n)) = y t - fj(i|*)yi-i|„ (51a) 

and 

eb, k (i | n) = <u„ H**(» | n)) = y,_„ - bj(i | w)y,|„. (51b) 

4. Forward and backward cost functions, 

t f Ji | n) m fl E /A (i | n) || 2 , e^tf | n) = || E 6iio (i | n) || 2 . (52) 

5. PARtial CORrelation (PARCOR) coefficient, 

KjJLi) s < E /.'o + i(^ I n - 1), E 6 , IO (i - 1 1 n - 1)>. (53) 

6. Auxiliary variables, or gains, 

gio+iO'N ■ *%+iMiiiy«|w (54a) 

K +1 (i\n) a ft^iay^nM (54b) 

7io + i(i|") ■ <u» Pm.^-dU.) = yl|»#5f^|«yi|n, (55a) 

yt+l(i\n) ■ (Uip PM.iO.n-liUio) = y£+n|n < ^ln,«|,,y«o+n|n, (55b) 

and 

Ctio+iUln) = (Ui,,, P M ,<0,n-l)Ui) = yS»*5+fMlny«»+n|i>' (55c) 

Notice that 

S,-(0, n - Dg^Ailn) = P M ffi*-vM (56a) 

and 

S,(0, n - l)\ +1 (i\n) = Pm^-dU^ (56b) 

and that 

7«o+i(» I ") " *&-i(* I n )ynn> (57a) 

7? +i(i I n) = hj+i(i | r0y*+n|», (57b) 

and 

aio+itf I «) = gj+i(» I «)y. +"in = hj + i(t I n)y,|„. (57c) 

Using the notation in the last section, the gains y and y* are, 
respectively, sin 2 0, and sin 2 0f, where 0, and 0,* are, respectively, the 
angles between M,(0, n - 1) and M,(0, rc - 1), and between M,(0, n - 
1) and Mf (0, n - 1). 
At each time instant our objective is to minimize the cost functions 
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cf^tiln) and e biio (i\n). From the discussion in Section III it follows 
that 

E ftio (i\n) = Pi, tihn) Y io+nii (58a) 

and 

E btio (i | n) = Pi/ l( o,n-i)(2"' , Y io+n , l ). (58b) 

The following variables, which are closely related to the prediction 
residuals, are also needed: 

E/,io(l I n) ■ ■f > Af 1 | 1 _ 1 (l,n)Yi 0+n ,j 

= Y l0+n ,, - S,-(l, nMJLi -l\n), (59a) 

and 

E£t(j|Fi) ■ Pt l|wW *-i)(*"*T<rHa) 

= 2- n Y, 0+n ,, - S,(0, n - \)h k (i - 1 1 n), (59b) 

i.e., E/and E b are the forward and backward residual vectors obtained 
by using prediction vectors computed at the previous time interval. 
The top components of E/,, (i | n) and E' b<ifJ {i \ n) are, respectively, 

ek (i\n) = (uuEfadln)) 

= y,-fj(i- l|n)yi-n n (60a) 

and 

e' b 4i\n) = (u,-,EWi|n)) 

= y,- n -bT (i- l|»)y,|„. (60b) 

The nth order forward prediction residual computed at time io + n 
using the tap vector f^(i | n) is 

e?,i d\n) m (uJB^(i|u)> 

= y-oH-" ~ ftf I »)y^,+n-l|n. (61) 

The forward residual vector at time i using the forward prediction 
vector calculated from the data samples fy, 0+ i, • • • , y,} is 

E/,. i<o+i(i|n.) = ■Pw lolHj+1 (i,f.)Yi 0+n ,, 

= Yw Si(l,n)f k+1 (i\n). (62) 

The variables et*>{i I n) and Eu»i%ti(i|n) are similarly defined. 

Notice that the time indices associated with a residual vector change 
in accordance with the projection space, i.e., 

^ > 3Br,(i,n-i)Yi 0+B ,i =E/ > j 0+ i(i|n — 1), (63a) 
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and 

Pitor-ito-Y+J) = ■*<* - 1 1 n - 1). (63b) 

The recursions needed to derive the algorithms in the next section 
are now generated systematically. By appropriately defining the vec- 
tors and subspaces entering the projection order update (22), order 
updates are derived for all of the basic variables defined by (49) through 
(55). We then use the forward and backward time updates (34) and 
(43) to obtain forward and backward time updates for the basic vectors 
defined by (49), (50), and (54). Finally, the forward and backward 
time updates for inner products (35) and (44) are applied to 
kn#>(i), *fjJLi\n)i and tb,kd\n). It would take up too much space to 
explicitly define the vectors and subspaces that must be substituted 
in the projection update used to derive each recursion. Consequently, 
only the results are stated, with a few representative examples worked 
out in more detail. 

5.2 Order updates 

The following order updates are obtained by using the projection 
order update (22) [or equivalent^ (21)]. The /th through the mth 
component of f, (i | n) is denoted by [f k (i | rc)],, m , and [f^(i | n)]j is the 
j th component of f^(i | n). The same notation is used for the backward 
prediction vector bj (i | n) and the gain vectors g,- (i | n) and hji | n). 

• E^i-lln-l), (64a) 

*fc«i») - *# - in - 1) - e/ jff.. 1} 

>E fM i(i\n - 1), (64b) 

(65a) 



,., > /-, ,\ «n,to(0 

W« I n) = **«<■ I n - 1) - £6 . ( -_ 1|n _ 1) ■ 
W*|n) = W " II"- 1) - R { i\n-l)> 



(65b) 



^'•'^' wAi'-D ' (66a) 

[tyi I n)]i^-i = fi 0+ id I » - 1) - Wi I ")]«M* - 1 1 » - 1). (66b) 
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^ ( ' |B)ll %Jff-ir < 67a > 

[KUl n)] 2 .„ = b,- (i - 1 1 n - 1) - [hfc(*|»)]jf,n.i(i|n - 1), (67b) 

[g lQ (i\n + l)] n+l = e -^^, (68a) 

[gi d I n + l)]i.„ = g,- 0+ i(i I n) - [g io (i | n + DWib^J | re), (68b) 

[&o(^ + l)].=^r|, (69a) 

[fto(i I n + D] 2 ,n + i = g IO (i - 1 1 re) - [g^i | re + l)\it k {i | re), (69b) 

[hfctf|n + l)W = £ ^£f^, (70a) 

W I re) 

[h, (i | re + l)] 1>n = h l0+1 (; | re) - [h, (i | re + l)]„ + ib fe (i | re), (70b) 

[hJ/lre+DJ^^^T, (71a) 

[h, (i | re + 1)] 2 .„ +1 = h^i - 1 1 re) - [Kd I n + l)]Afi I n), (71b) 

7, (*' 1 re + 1) = y k+1 (i | re) + C ^. ( ?)^ , (72a) 

y io (i | re + 1) = 7i0 (i - 1 1 n) + ^T7 » ( 72b > 

y*(i\n + 1) = tWi|») + e -~^, (73a) 

y#i\n+ 1) = 7 $(i - l|n) + ^j^, (73b) 

a h (i | re + 1) = ot io+1 (i | re) + —* .., . — , (74a) 

and 

«„(,- 1 n + 1) - «,« - 1 1 „) + e ^ ( ' l " )e ff |n) . (74b) 

As an example, (64a) is derived from (22), where M is replaced by 
M,(l, re - 1), X is replaced by z~ n Y^+„^ and Y is replaced by Y/ 0+n ,,. 
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By observing that E^i - 1 1 n - 1) is orthogonal to M,(l, n - 1), it 
is clear that 

(Y k+n , it E btio {i - l|n - 1)> = (E fM i(i\n - 1), 

.Ka*(i- l|n-l)) 

= KJi). (75) 

Recursions (65) are obtained by taking norms of (64) respectively. 
The recursions (68) through (71) are obtained from (21), where Y is 
replaced by u, and u*,,, respectively. Making the same substitutions in 
(21) and then taking inner products with u, or u^ gives recursions (72) 
through (74). 

5.3 Forward time updates 

The following forward time updates are obtained from the (orthog- 
onal) projection operator forward time update (34): 

Eufi I n) = E;, Io (; I n) - [P M ,«, n) u] 1 _ ^ l "\ : - } , (76a) 



E^d" | n) = Ktfc(j | n) - [P Mli0 .n-i)Ui] 1 _ r ^ l(l - |w) • (76b) 



eb(i I n ) 

7io+i< 



tffi\n) = t k (i -l\n) + g i0 (i -l\n) t _ g*!^ , „) 



(77a) 

b, (i | n) = b,(i - 1 1 n) + g k+l (i | n) l l b, ^^ n) . (77b) 

h io (i | n) = hfi ~l\n)- *(i | n) ^ "*^\ n) , (78) 

and 

**,(» I n) - yWM* - 1 1 n)[l - 7 fe(* I »)]. (80) 

Equation (78) is obtained from (34), where Af, is replaced Af,(0, n) and 
Y, is replaced by u^. Equations (79) and (80) are obtained by making 
the same substitutions in (34) and then taking inner products with 
u^ and u„ or by premultiplying (78) by yJ+„-n„ and yf\ n , respectively. 
Taking the inner product of (76) with u, and u^, respectively gives the 
following recursions: 
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•W* I "> = •&(» -H")- *fcti I n) a ^ (t '" 1| 1 ,> | . , (82a) 

1 - 7^(1 - 1 1 n) 



and 



•t*(i I n) = e?J - 1 1 n) - **(i | n) a ^ l \ n) . (82 b) 

5.4 Backward time updates 

The following backward time updates are obtained from the projec- 
tion operator backward time update (43): 



E f , io (i\n) = Efuwdln) - [fn^uj ^'"l, w 

1 — 7EU — In) 



(83a) 



E^(i | n) = E b , m+1 (i | n) - [P^-nuJ ^M , (83b) 

fio + i(i 1 1) = tkd I ») - M« " 1 1 n) : e; ? / ( ? |w) 1l ; , (84a) 

i — 7«oU — i|nj 

b^tf | n) = b^i | n) - K +1 (i I n) C ^ t|fl . ) , (84b) 

1 - 7«J+iU|l) 

gio (i | n) = g, 0+1 (i | n) - \(i | n) - ^Mg) ? (85) 

1 _ 7ioU|nJ 



^ (l,n) "^ l(,|n) -l- 7 «.-|n) 



(86) 



and 

<*i (i I n) = yJ + „-n„g^ + i(i | n)[l - yt(i | »)]. (87) 

Equation (85) is obtained by replacing Y, by u, in (43). Equations (86) 
and (87) are obtained by premultiplying (85) by yf\ n and yJ+ B -i|i», 
respectively. The following recursions are obtained by taking the inner 
product of (83) with u„ respectively: 

e fM1 (i | n) = e fJo (i \ n) + efji \ n) ^ (t " 1| 1 W | > , (88a) 
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and 



e bM1 (i I n) = e^ii | n) + efod I n) t ^j^j ■ (88b) 

The recursions that result from taking inner products with u^ will not 
be used and are therefore omitted. 



5.5 Inner product updates 

The following recursions are obtained from the forward time update 
for inner products (35): 

*■*(*) = W* - D + «/*«tf I " - l W* - 1 1 " - i) 

1 



1 - yio+iii - l|n- 1)' 
tfji I n) = e fAt (i ~ 1 1 n) + elfi I n) - _ . _ j j r) , 

and 

e 6A (i | n) = e^i - 1 1 n) + e(fc(i | n) - _ y . + ^^ n) • 

The following recursions are obtained from the backward time 
update for inner products (44): 

k n ,i (i) = k niio+1 (i) + ef tio+ i(i | n - l)e£*,(i - 1 1 n - 1) 

1 



(89) 
(90a) 

(90b) 



'l- 7 f 0+ i(;-l|"-l)' 

1 



(91) 
(92a) 

(92b) 



**(» I ") = */*«(* I ») + «W" I ») i - 7 *(j- l|n) ' 

and 

cwbd'ln) = tbfriUln) + et%(i\n) j ^T^J]^- 

Equations (89) and (91) are obtained by using (35) and (44), where v, 
is replaced by Y io+n ,„ Y,- is replaced by z" K Yt gtm j, and M, is replaced by 
M,(l, n - 1), respectively. The previous set of recursions (64) through 
(92) are complete in the sense that any existing least-squares alogrithm 
can be derived by manipulating suitable subsets of these recursions. 
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VI. RECURSIVE FIXED-ORDER COVARIANCE ALGORITHMS 
6. 1 Sliding-window algorithm 

Combining (60), (61), (68) through (73), (77), (81a), (84), (90a), and 
(92a), gives the following sliding-window LS algorithm for the predic- 
tion coefficients. Where unspecified, the order of the variable is 
assumed to be N, the order of the least-squares filter. Also, the starting 
time index is denoted as io. If the sliding window contains M data 
values, then io = i — M + 1, 

«/*(*') = yt - ff(» - Dy.-i. (93a) 

4,(0 = W* " 1) + *kfi ~ l)e/,. o (0, (93b) 

•utf) = efjoii)[l - ytoii - 1)1 (93c) 

«&(*) = y«+N ~ tfrfty&N-i, (93d) 

«/*,(*) = **(* " 1) + efadfajjii), (93e) 



g^liV+1)]^^, 



(93f) 



[g^i | AT - l)] aA+1 = 4(1 - 1) - [g^i | AT + 1)1^(0, (93g) 

[h fc (i|iV + l)] 1 -2&fei | (93h ) 

[h*(i| JV + l)Wi = M» " 1) " [Kd\N + DJifJi), (93i) 



iwo = «wo - M - 1) r7#rn . ( 93 j) 



(93k) 



eijjj.) = yi- N - bid - l)y„ (931) 

*V 1 ' ; , ,-vr — ,.,»,. ,,, , (93m) 

l-eUOfeoOIN+DWi 

efo,(») = y*o - bj(0yio+jv, (93n) 

S*«(i) = WP+Dki»+IW»^+l)li»«WJ), (93o) 

h*, +1 (i) = [h^iliV+Dhjv + fb^ilJV+lWxb^i), (93p) 

yio (i- 1) + e,, io (i)[g io (i\N+l)] 1 
„ d\ - -ej lo (i)[Si (i\N+ l)] N+l 

7 ' 0+l(l) " l-eUOte^liV+DW, ' (93q) 
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7 WO = yW - D + •MOWI^ + Dli 

- e^h^l TV +!)]„, (93r) 



and 



b, +1 (o = b 10 (o - h, +1 (o ;y ; . (93s) 



gyO 

YW 

The recursions (93m) and (93q) were not listed in the previous section, 
but are easily obtained by solving (77b) and (68b) simultaneously for 
b, (i), and by substituting (68a), (69a), and (81b) into (72), and solving 
for 7, 0+ i(i | n). Notice that all data samples in the sliding window 
(yi-M+i, .... yi) must be stored. This is also true of the order-recursive 
sliding-window algorithm presented in Ref. 14. If division is counted 
as multiplication, then the algorithm (93) requires 12N + 16 multiplies 
and 12N + 12 additions at each iteration. In contrast, the unnormal- 
ized sliding-window lattice predictor (see Appendix B) requires 16N 
multiplies and 1(W additions per iteration, and the normalized lattice 
predictor 16 requires 30iV multiplies, 18N additions, and 6AT square 
roots per iteration. 

Because sliding-window algorithms have finite memory, initializa- 
tion for these algorithms is basically the same as for the prewindowed 
case, i.e., the data y, can be assumed to be zero for i < 0. After M 
iterations, where M is the window length, these data points are 
discarded. The algorithm (93) is therefore initialized by setting the 
gains 7 and 7*, and the elements of the vectors f, b, g, and h equal to 
zero, and letting 

6/,, (0) = 5, (94) 

where 5 is chosen to ensure that the algorithm remains stable. It is 
easily verified that for time i < M - N - 1, where M is the length of 
the sliding window, the algorithm (93) becomes a modified version of 
the prewindowed LS transversal (fast Kalman) algorithm. 1,22 

6.2 Growing-memory covariance algorithm 

The following fixed-order growing- memory covariance algorithm is 
obtained by combining (60), (68b), (69), (77), (78), (80), (85), (87), and 
(90a). The lower index of the window i is assumed to be zero. For 
notational convenience we define the following variables, 

and 
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Wli.)- .'*"'.! r 05b) 

Where unspecified, the lower time index and the order of the variables 
are equal to zero and N, respectively,* 

efii) = yi = f(i - l)y,-i, (96a) 

t(i) = t(i - 1) + g(i - l)e/(0, (96b) 

ef{i) = y, - f T (0y«-i, (96c) 

efii) = efii - 1) + e f ii)efii), (96d) 

[e(i\N+l)h = ^ y (96e) 

[g(i 1 N + 1)] 2 .n + . = g(i - 1) - [g(i | N + l)]if(i), (96£) 

e*(i) = y,- N - b T ii - l)y„ (96g) 

1 - e b (i)[fS(i\N + DWi 

gi(0 = [gii\N+l)] liN +[gii\N+l)] N+ Mi), (96i) 

0(0 = yfhtf - 1), (96j) 

0*(i) = yJC-igi(i), (96k) 



and 



, A gi(0 - gnOhg - l) , Qfin 



h(i) = h(i - 1) - |8(i)g(i). (96m) 



Notice that this algorithm can be applied only if i > iV. Otherwise, 
the least-squares variables of order N are undefined and cannot be 
used to compute the same least-squares variables at the successive 
time interval. Initialization of this algorithm can be performed, how- 
ever, by using an order-recursive algorithm for i < N to increase the 
order of the filter by one at each successive time iteration. An order- 
recursive algorithm for the prediction coefficients is obtained by 
combining (89), (66a), (67a), top components of (64a) and (64b), (66b), 
(67b), (65a) and (65b), (82a), (88a), (84a), (92a), (71), (73b), (72), and 
(74b). This algorithm is basically the same as the covariance lattice 



* The author recently discovered that this algorithm has been independently derived 
in Ref. 23 using an algebraic approach. 
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algorithm presented in Refs. 7 and 19, except that additional order 
recursions for the prediction vectors f and b have been added. It is 
not explicitly stated in an effort to conserve space. Order-recursive 
computation of f and b requires order N 2 arithmetic operations per 
iteration, rather than order N operations per iteration, as required by 
the fixed-order algorithm. Not all N components of the vectors f and 
b need to be updated at each iteration for i < N, however. If data is 
first received at time i = 0, the recursions listed above can be used for 
n = up to n = i. At time i = N all of the variables that enter the 
fixed-order algorithm (96) have been computed by the order-recursive 
algorithm except for g(i'), 0(i) and 0*(i). The gain g(i) is the only 
variable needed at the next iteration of the fixed-order algorithm and 
can be computed by first using (96j) to calculate 0(i) and then using 
(96m) to solve for g(i). 

Derivation of initial conditions for the order-recursive initialization 
routine is significantly more complicated than for the sliding-window 
algorithm. This is because for i = n, the matrix <J>„,,|„ is guaranteed to 
be singular, and hence all variables are technically undefined. Refer- 
ence 14 gives a convenient solution to this startup problem. By using 
a generalized inverse of a singular or nonsingular matrix, the least- 
squares projection operator P, given by (15), can be defined even when 
the matrix S r S is singular. If this generalized inverse is defined 
appropriately, it can be shown that the projection updates in Section 
IV hold even when the covariance matrix is singular. This implies that 
all of the recursions listed in the last paragraph that constitute the 
order-recursive initialization routine can be used starting from i = 
with the following initial conditions: 

f(0 1 0) = b(0 1 0) = t\(-l 1 0) = h(-l 1 0) = 0, (97a) 

k n (-l) = 0, 1 *£ n *£ N, (97b) 

T (-l 1 0) = 7*(-l I 0) = a(-l | 0) = 0, (97c) 



and 



•ww-1? t :ii (97d > 



At each iteration i < N, 

e f (i\0) = e b (i\0)= yi (97e) 

and 

e/(i 1 0) = e b (i 1 0) = e f (i - 1 1 0) + yl (97£) 

Counting division as multiplication, the algorithm (96) requires 
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11 AT + 7 multiplies and UN + 1 additions per iteration. To compare, 
the unnormalized growing-memory lattice predictor (see Appendix B) 
requires 22N multiplies and 12N additions per iteration. The normal- 
ized lattice algorithm requires 30N multiplies, 18N additions, and 6N 
square roots per iteration. We point out that the fixed-order co variance 
algorithm specified by (96) is not unique. In particular, equation (96c) 
can be replaced by (81a). The extra recursion (93q) must be added to 
compute 7, however. This type of modification has been applied to the 
fast Kalman algorithm, and has resulted in improved numerical prop- 
erties. 22 



VII. EXTENSIONS TO JOINT-PROCESS ESTIMATION 

The algorithms presented so far solve the LS prediction problem 
wherein the sums (1) and (2) are minimized. In applications such as 
channel equalization, echo and noise cancellation, and adaptive line 
enhancement, two processes, [xj\ and [yj], are given, and our objective 
is to estimate the \xj\ process in terms of the [yj] process. The vector 
of estimation errors is denoted as 



n-l 

E Xiio+ i(i\n) = X io+n ,i - £ c j+ u n (z~ j Y, 



io+n,i) 

= X^ - S,(0, n - l)<Vi(i | n), (98) 

where X^,, is defined by (23a), c^+iii \ n) is the rc-dimensional vector 
of regression coefficients at time i used to estimate X^+nj [given by 
(4c)] where i' = io + n), and the lower time subscript of E x and c 
denotes the time index of the starting value from the v sequence (i.e., 
yio+i), which is used in the least-squares computation. Our objective is 
to choose Gfc+i(( | n) such that 

Wi(<» - IIE^d'ln)!! 2 (99) 

is minimized. The discussion in Section III implies that 

E Ztio+l (i | n) = Pfco.n-nX^,,-. (100) 

We now use the projection recursions in Section IV to derive order 
and time updates for E Xi ^(i | n) and c^(i | n). Details are again omitted 
since they are basically the same as before. Combining recursions in 
this section with the prediction algorithms of the last section results 
in recursive algorithms that solve the LS joint-process-estimation 
problem. 

The following notation, which is analogous to the notation in 
Section 5.1, is first defined: 

1. Cross-correlation coefficient, 
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fc&ACO-W^lWilii)) 

= (E x , io+1 (i\n),E b , il) (i\n)). (101) 

2. Current residual (scalar), 

e x ^(i\n) = <u„ Exhdln)) 

-«-<(i| iibrii.. (102) 

3. Past residual (scalar), 

e£frti(i|n) = < u *> E^+iO'ln)) 

= x^+n - cj + i(i | n)yfrt*|ii* (103) 

4. Oblique residual 

e^i | n) - xi - cj(i - 1 1 n)y,m. (104) 

The following order recursions are obtained from (22): 

B*(i | » + 1) - E x * +1 (i | n) - *%j§ **(< | ii), (105) 

<*«> + » = w »N- SlS' <106) 

w iin+ *'=SiS' (107a) 

and 

K(i | n + D]i,n = <Vi(i I ») - W* I n + Dl+fafi I »>• < 107b > 

Derivation of the following forward time updates involves a straight- 
forward application of (34) and (35), where Y, is replaced by X^ 
and Mi is replaced by M,(0, n — 1): 

Ci.d I n) = *fi - 1 1 ») + eW* I »)*«.<» I »). ( 108 > 

fe£W*) = kJHufi ~ 1) + ***i(" I »Hfc<« I ») i-^^ilw) ' (1 ° 9) 

oc«o(t I n) 

and 



4*(l I n) = **<i - 1 1 n) + elji \ n) 1 _^ ( - |r|) . ( uo ) 

<*(i I ») = «M* -H")- W« I "> i !^("l ») ' (111) 
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^■iS- (112) 



Similarly, the following backward time updates are obtained from (43) 
and (44): 

c k (i\n) = c io+1 (i\n) + \d\ri) - l*^^ , (H3) 

**«(* I *) = e, A (i | r») + e&(* I ») t -^i(?j„) • < 114 > 

fe&i*+i(i) = k&uj® ~ «W* I »W I n) 1 ... . , (115) 

and 

c«A+i(i I ») = €«,^(i | n) - e*%{i | re) — — - . (116) 

Combining (104), (108), (103), and (113) (in that order) with the 
fixed-order sliding-window algorithm (93) gives the corresponding 
sliding-window joint-process-estimation algorithm. Adding these ad- 
ditional recursions results in a total computational complexity of 
16N + 17 multiplies and 16N + 13 additions per iteration. This should 
be compared with 23iV multiplies and 14iV additions per iteration 
required by the unnormalized sliding-window lattice joint-process 
estimator. Initialization of these additional recursions is accomplished 
in a fashion analogous to the prediction recursions. In particular, the 
data y, and x, is assumed to be zero for i < 0, and c^— 1 1 n) = 0. 

The fixed-order growing-memory algorithm (96) is extended to the 
joint-process-estimation case by adding the recursions (104) and (108). 
The order-recursive prediction algorithm listed in Section 6.2 is ex- 
tended to the joint-process-estimation case by adding the recursions 
(105) (top component only), (109), (107), (111), and (113). In each 
case the variable io = 0. Adding (104) and (108) to (96) results in a 
total computational complexity of 1SN + 7 multiplies and 13AT + 1 
additions per iteration. This should be compared with 28iV multiplies 
and 167V additions per iterations required by the growing-memory 
covariance lattice joint-process estimator. The following accomplishes 
the initialization of the additional recursions for the order-recursive 
algorithm: 

fcM-1) = 0, 1 ^ n «£ N, (117a) 

c(-l | n) = 0, =£ n *= N, (117b) 
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and 

[0 for m > 0. 

The fixed-order algorithm is initialized by using the order-recursive 
algorithm for i < N. 

VIII. CONCLUSIONS 

We have presented new fixed-order algorithms that recursively solve 
the sliding-window and growing-memory covariance least-squares es- 
timation problems. The fixed-order growing-memory algorithm re- 
quires approximately one half the number of multiplies and divides 
required by the analogous unnormalized order-recursive or lattice 
algorithm. The fixed-order sliding-window algorithm requires approx- 
imately 70 percent of the number of multiplies and divides required 
by the analogous lattice algorithm. These fixed-order algorithms also 
help complete the list of computationally efficient LS algorithms 
currently available. In particular, each type of windowing technique 
that has been proposed for the LS computation (i.e., prewindowed, 
growing- memory covariance, and the sliding window) has resulted in 
both computationally efficient fixed-order and order-recursive algo- 
rithms. The order-recursive algorithms offer the advantage of being 
able to dynamically choose the order of the autoregressive model, while 
the fixed-order algorithms require less computation. 

Associated with the algorithms mentioned in this paper are per- 
formance issues such as the relative convergence speed of each algo- 
rithm given different types of stationary and nonstationary random 
inputs, and the evaluation of finite word-length effects. As an example, 
the relative performance improvement offered by LS covariance al- 
gorithms over LS prewindowed algorithms has yet to be ascertained 
in applications where the prediction coefficients must be estimated 
from relatively few data samples. These issues will play a crucial role 
in determining the practical value of the LS algorithms presented in 
this paper. 
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APPENDIX A 
Derivation of (30) 
We wish to prove (30). By definition, 

P MiU _Ji = PmJi + PvJPm^Yi, (118) 

where M is the subspace spanned by the column vectors of S,. Pro- 
jecting both sides of (118) onto M, gives 

P Mi P Mi ^i = Pu-Pa^i + P Mt Pu^M t[ M Y|. (119) 

Now Pj»/. |i _,Yi lies in M„ and hence 

Pw.Pji^Yi = Pw i|w Y,. (120) 
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Also, 

pj4p*,y< = s,(srs l )- i (srs,)(srs,)- i srY 1 

= P Mi .(Y, - Pt/Y,). (121) 

Combining (118) through (121) gives 

P M Yi = P MilM Yi + PjiAAhYj 

= Pjy^Y, + (Pif^)<U|, Pfc, w Yi>. (122) 

Subtracting both sides of (122) from Y„ and then taking inner products 
of both sides with u, gives 

<u„ PhYi) = <u„ P* l|M Y f >[l - <u„ P M xti)]. (123) 

Combining (122) and (123), and using the definition (31) gives (30). 
[Ref. 24 gives a purely geometric proof of (30) for the case where M, 
is spanned by one vector (as illustrated in Fig. 2).] 
To derive the inner product update (35), we first rewrite (34) as 

PhYi = PfcPfc, H Yi + iyfc^Y, - Pj4"K«* PfcYi>secty 

= PifcPiYj + u f <Ui, Pfc^Yj) - Pm.u.Xu,, PfcY.-Jsecfy 

= Pbtfpt + PhpAUk PfcYi>Becrti. (124) 

Taking the inner product of both sides with v, and using the fact that 
<v„ Phui) = <u„ Pfev.) (125) 

gives (35). 

APPENDIX B 

Other Recursive Least-Squares Algorithms 

The recursions in Section V and VII are complete in the sense that 
any of the existing computationally efficient LS prediction or joint- 
process-estimation algorithms can be derived from suitable subsets of 
these recursions. The purpose of this appendix is to illustrate this 
point by listing the recursions that enter the prewindowed LS trans- 
versal (fast Kalman) and lattice algorithms, the unnormalized sliding- 
window and growing-memory covariance lattice algorithms, 14 and the 
nonrecursive LS algorithm presented in Refs. 11 and 17. The list of 
recursions presented below does not completely describe each algo- 
rithm. For example, initialization is not discussed. Consistent steady- 
state algorithms can be formulated, however, by choosing the time 
indices and order of the variables in each recursion appropriately. The 
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following algorithms apply to the more general joint-process-estima- 
tion case (eliminating the recursions from Section VII gives the 
analogous prediction algorithm): 

1. Prewindowed transversal (fast Kalman) algorithm: 1 (60a), (77a), 
(51a), (90a), (69), (60b), (93m), (68b), (104), and (108). 

2. Prewindowed lattice algorithm* (See Refs. 18 and 16): (89), (64a) + 
and (64b), (65a) and (65b), (72a), (109) and (105). 

3. Sliding-window lattice algorithm: 1416 (89), (64a) and (64b), (65a) 
and (65b), (72a), (73a), (91), (109), (105), and (115). 

4. Growing-memory covariance lattice algorithm: 14,16 (89), (64a) and 
(64b), (65a) and (65b), (82a), (88a), (92a), (72b) and (72a), (73b), (74b), 
(109), (105), (111), and (114). 

5. Nonrecursive LS algorithm: 11,17 

The following set of recursions, which represents a modified version 
of the algorithm presented in Ref. 17, can be used to compute f(i | N), 
b(i | N), and c(f | N), given by (4), in an order-recursive fashion starting 
with first-order least squares variables at time i. Initialization consists 
of computing these first-order variables via the definitions given in 
Section V. 

(78) (for computing h(i - 1 1 n)), (85) (for computing gi(i | n)), (79), 
(86), (84a), (77b), (92a), (90b), (53), (66), (67), (65a) and (65b), (51b), 
(61), (68), (71), (57c), (72a), (73b), (103), (113), (101), (107). 

Assuming that the covariance matrix $n,hn+i has been computed, a 
more convenient form for (53) is 

k n (i) = <Y„,-, E fc (i- 1|»-D) 

= YlfcT* " SKI, n - l)b(i - 1 1 n - 1)] 

n-l 

= Rn - I R n -j{b(i - 1 1 n - 1)],, (126) 

where Rj = Yj,,(2~ 7 Y„,,), and is the (1, j + l)st element of 4> n ,,| n+ i. 
Equation (101) can be similarly modified. 
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* The normalized lattice algorithms presented in Refs. 13 and 14 can be obtained by 
making substitutions for the variables entering the unnormalized recursions presented 
here. 16 -"* 

f In the prewindowed and growing-memory covariance cases, the inner products of 
(64) and (105) with u, are required. In the sliding-window case, the inner products of 
(64) and (105) with both u, and u^, are required. 
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